NSF PAR Search | NSF Public Access Repository


Hybrid Priority Queue and its Applications

https://doi.org/10.1145/3695411.3695417

Li, Zhouzi; Harchol-Balter, Mor; Scheller-Wolf, Alan (September 2024, ACM SIGMETRICS Performance Evaluation Review)

Priority queues are well understood in queueing theory. However, they are somewhat restrictive in that the low-priority customers suffer far greater waiting times than the high-priority customers. In this short paper, we introduce a novel generalization of a two-class priority queue, which we call Hybrid. We prove that Hybrid has a much broader achievability region than strict priority, allowing for a much greater range of waiting time pairs. We demonstrate settings where this new flexibility can increase the revenue obtained by a service system (such as airport TSA) selling priority.
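To make the waiting-time gap under strict priority concrete, here is a minimal sketch using the classical mean-waiting-time formulas for a two-class non-preemptive priority M/M/1 queue. It does not implement the paper's Hybrid policy. The function name `priority_waits`, the choice of a non-preemptive M/M/1 with a common service rate, and the example rates are all assumptions made for illustration.

```python
def priority_waits(lam_high, lam_low, mu):
    """Mean time in queue for each class under strict non-preemptive priority.

    Assumes (for illustration only) Poisson arrivals with rates lam_high and
    lam_low, and exponential service with rate mu for both classes.
    """
    rho_high = lam_high / mu
    rho_low = lam_low / mu
    rho = rho_high + rho_low
    if rho >= 1:
        raise ValueError("total load must be below 1 for stability")
    # Mean residual service seen by an arrival:
    # sum_i lam_i * E[S^2] / 2, with E[S^2] = 2 / mu^2 for exponential service.
    w0 = rho / mu
    wait_high = w0 / (1 - rho_high)
    wait_low = w0 / ((1 - rho_high) * (1 - rho))
    return wait_high, wait_low


# Hypothetical example: equal arrival rates, total load 0.8.
wait_high, wait_low = priority_waits(0.4, 0.4, 1.0)
print(wait_high, wait_low)
```

With these assumed rates the low-priority class waits five times as long as the high-priority class, which is the kind of fixed waiting-time pair that the abstract describes as restrictive.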
Full Text Available
The RESET and MARC techniques, with application to multiserver-job analysis

https://doi.org/10.1016/j.peva.2023.102378

Grosof, Isaac; Hong, Yige; Harchol-Balter, Mor; Scheller-Wolf, Alan (November 2023, Performance Evaluation)

Full Text Available
The M/M/k with Deterministic Setup Times

https://doi.org/10.1145/3570617

Williams, Jalani K.; Harchol-Balter, Mor; Wang, Weina (December 2022, Proceedings of the ACM on Measurement and Analysis of Computing Systems)

Capacity management, whether it involves servers in a data center, or human staff in a call center, or doctors in a hospital, is largely about balancing a resource-delay tradeoff. On the one hand, one would like to turn off servers when not in use (or send home staff that are idle) to save on resources. On the other hand, one wants to avoid the considerable setup time required to turn an "off" server back "on." This paper aims to understand the delay component of this tradeoff, namely, what is the effect of setup time on average delay in a multi-server system? Surprisingly little is known about the effect of setup times on delay. While there has been some work on studying the M/M/k with Exponentially-distributed setup times, these works provide only iterative methods for computing mean delay, giving little insight as to how delay is affected by k, by load, and by the setup time. Furthermore, setup time in practice is much better modeled by a Deterministic random variable, and, as this paper shows, the scaling effect of a Deterministic setup time is nothing like that of an Exponentially-distributed setup time. This paper provides the first analysis of the M/M/k with Deterministic setup times. We prove a lower bound on the effect of setup on delay, where our bound is highly accurate for the common case where the setup time is much higher than the job service time. Our result is a relatively simple algebraic formula which provides insights on how delay scales with the input parameters. Our proof uses a combination of renewal theory, martingale arguments and novel probabilistic arguments, providing strong intuition on the transient behavior of a system that turns servers on and off.
Full Text Available
Optimal Scheduling in the Multiserver-job Model under Heavy Traffic

https://doi.org/10.1145/3570612

Grosof, Isaac; Scully, Ziv; Harchol-Balter, Mor; Scheller-Wolf, Alan (December 2022, Proceedings of the ACM on Measurement and Analysis of Computing Systems)

Multiserver-job systems, where jobs require concurrent service at many servers, occur widely in practice. Essentially all of the theoretical work on multiserver-job systems focuses on maximizing utilization, with almost nothing known about mean response time. In simpler settings, such as various known-size single-server-job settings, minimizing mean response time is merely a matter of prioritizing small jobs. However, for the multiserver-job system, prioritizing small jobs is not enough, because we must also ensure servers are not unnecessarily left idle. Thus, minimizing mean response time requires prioritizing small jobs while simultaneously maximizing throughput. Our question is how to achieve these joint objectives. We devise the ServerFilling-SRPT scheduling policy, which is the first policy to minimize mean response time in the multiserver-job model in the heavy traffic limit. In addition to proving this heavy-traffic result, we present empirical evidence that ServerFilling-SRPT outperforms all existing scheduling policies for all loads, with improvements by orders of magnitude at higher loads. Because ServerFilling-SRPT requires knowing job sizes, we also define the ServerFilling-Gittins policy, which is optimal when sizes are unknown or partially known.
Full Text Available
WCFS: a new framework for analyzing multiserver systems

https://doi.org/10.1007/s11134-022-09848-6

Grosof, Isaac; Harchol-Balter, Mor; Scheller-Wolf, Alan (October 2022, Queueing Systems)

Full Text Available
The multiserver job queueing model

https://doi.org/10.1007/s11134-022-09762-x

Harchol-Balter, Mor (April 2022, Queueing Systems)

Full Text Available
The most common queueing theory questions asked by computer systems practitioners

https://doi.org/10.1145/3543146.3543148

Harchol-Balter, Mor; Scully, Ziv (June 2022, ACM SIGMETRICS Performance Evaluation Review)

This document examines five performance questions which are repeatedly asked by practitioners in industry: (i) My system utilization is very low, so why are job delays so high? (ii) What should I do to lower job delays? (iii) How can I favor short jobs if I don't know which jobs are short? (iv) If some jobs are more important than others, how do I negotiate importance versus size? (v) How do answers change when dealing with a closed-loop system, rather than an open system? All these questions have simple answers through queueing theory. This short paper elaborates on the questions and their answers. To keep things readable, our tone is purposely informal throughout. For more formal statements of these questions and answers, please see [14].
Full Text Available
Open problems in queueing theory inspired by datacenter computing

https://doi.org/10.1007/s11134-020-09684-6

Harchol-Balter, Mor (February 2021, Queueing Systems)
Full Text Available
To clean or not to clean: Malware removal strategies for servers under load

https://doi.org/10.1016/j.ejor.2020.10.036

Doroudi, Sherwin; Avgerinos, Thanassis; Harchol-Balter, Mor (July 2021, European Journal of Operational Research)
Full Text Available
Zero Queueing for Multi-Server Jobs

https://doi.org/10.1145/3410220.3453924

Wang, Weina; Xie, Qiaomin; Harchol-Balter, Mor (May 2021, ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems)
Cloud computing today is dominated by multi-server jobs. These are jobs that request multiple servers simultaneously and hold onto all of these servers for the duration of the job. Multi-server jobs add a lot of complexity to the traditional one-server-per-job model: an arrival might not "fit" into the available servers and might have to queue, blocking later arrivals and leaving servers idle. From a queueing perspective, almost nothing is understood about multi-server job queueing systems; even understanding the exact stability region is a very hard problem. In this paper, we investigate a multi-server job queueing model under scaling regimes where the number of servers in the system grows. Specifically, we consider a system with multiple classes of jobs, where jobs from different classes can request different numbers of servers and have different service time distributions, and jobs are served in first-come-first-served order. The multi-server job model opens up new scaling regimes where both the number of servers that a job needs and the system load scale with the total number of servers. Within these scaling regimes, we derive the first results on stability, queueing probability, and the transient analysis of the number of jobs in the system for each class. In particular we derive sufficient conditions for zero queueing. Our analysis introduces a novel way of extracting information from the Lyapunov drift, which can be applicable to a broader scope of problems in queueing systems.
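The blocking effect described in this abstract (an arrival that does not fit holds up later arrivals and leaves servers idle) can be illustrated with a small sketch of first-come-first-served admission. This is not the paper's model or analysis. The function name `fcfs_admit`, the server count, and the job sizes are hypothetical values chosen for illustration.

```python
from collections import deque


def fcfs_admit(total_servers, server_needs):
    """Admit queued jobs in FCFS order until the head-of-line job does not fit.

    server_needs lists, in arrival order, how many servers each job requests.
    Returns (admitted jobs, jobs still waiting, idle servers).
    """
    queue = deque(server_needs)
    free = total_servers
    admitted = []
    # Strict FCFS: stop at the first job that does not fit, even if a
    # later, smaller job would fit in the remaining servers.
    while queue and queue[0] <= free:
        need = queue.popleft()
        free -= need
        admitted.append(need)
    return admitted, list(queue), free


# Hypothetical example: 10 servers, jobs requesting 6, 5, and 3 servers.
admitted, waiting, idle = fcfs_admit(10, [6, 5, 3])
print(admitted, waiting, idle)  # [6] [5, 3] 4
```

In this example the job requesting 3 servers would fit in the 4 idle servers, but it is stuck behind the job requesting 5, so those 4 servers stay idle.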
Full Text Available
